Insights into the Thermally Activated Cyclization Mechanism in a Linear Phenylalanine-Alanine Dipeptide

Dipeptides, the prototype peptides, exist in both linear (l-) and cyclo (c-) structures. Since the first mass spectrometry experiments, it has been observed that some l-structures may turn into the cyclo ones, likely via a temperature-induced process. In this work, combining several different experimental techniques (mass spectrometry, infrared and Raman spectroscopy, and thermogravimetric analysis) with tight-binding and ab initio simulations, we provide evidence that, in the case of l-phenylalanyl-l-alanine, an irreversible cyclization mechanism, catalyzed by water and driven by temperature, occurs in the condensed phase. This process can be considered as a very efficient strategy to improve dipeptide stability by turning the comparatively fragile linear structure into the robust and more stable cyclic one. This mechanism may have played a role in prebiotic chemistry and can be further exploited in the preparation of nanomaterials and drugs.


Thermogravimetric Analysis
The thermogravimetric and differential thermal analysis (TG-DTA) measurements performed on the l-PheAla sample from room temperature up to 800 °C are shown in Figure S1. The DTA results show four endothermic peaks (measured as heat flow in mW) centered at ≈ 105, 135, 260 and 362 °C. The mass losses (%) and their errors, as well as the temperature ranges (reported in Figure S1), have been estimated by a deconvolution method applied to the derivative of the weight curve. This procedure identifies five distinct steps of mass loss. The fit, performed using Voigt functions, indicates that the first two steps overlap and correspond to a total weight loss of about 16%. The last two steps also partially overlap and account for the largest contributions to the mass loss, 31% and 40%, with onset points at ≈ 300 and 314 °C, respectively. In more detail, the first endothermic peak, in the temperature range 80-130 °C, is related to a first weight loss of about 10%. The second step of mass loss (≈ 6%), with onset point at 120 °C, partially overlaps with the first one and is accompanied by another endothermic peak. Further heating of the sample (above 200 °C) leads to the appearance of a third prominent endothermic peak (onset point ≈ 240 °C) associated with a small mass-loss step of (2±1)%. In the temperature range 300-380 °C there is a dramatic mass decrease of ≈ 31%, without noticeable enthalpy changes. Finally, partially overlapping with the previous step, a 40% weight loss is observed between 314 and 400 °C, concomitant with the last endothermic peak. The presence of two main steps of mass loss suggests that the residual sample may undergo a "two-step sublimation" between 270 and 400 °C, with a total weight loss of 71%.
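The deconvolution idea can be illustrated with a minimal sketch. It is not the fitting code used for Figure S1: it only shows the forward model, in which the derivative of the weight curve is written as a sum of peaks and the area of each peak gives the mass loss of that step. A pseudo-Voigt profile (a weighted sum of a Gaussian and a Lorentzian) is used here as a common stand-in for the true Voigt function. The peak centers, widths and mixing factor are hypothetical placeholders; only the areas (10, 6, 2, 31 and 40%) are taken from the text.

```python
import math

def pseudo_voigt(t, area, center, fwhm, eta):
    """Area-normalised pseudo-Voigt peak (an approximation to a Voigt profile).

    eta = 0 gives a pure Gaussian, eta = 1 a pure Lorentzian; both share fwhm.
    """
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    gauss = math.exp(-0.5 * ((t - center) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
    hwhm = 0.5 * fwhm
    lorentz = hwhm / (math.pi * ((t - center) ** 2 + hwhm ** 2))
    return area * (eta * lorentz + (1.0 - eta) * gauss)

# (area in % mass loss, center in deg C, FWHM in deg C).
# Areas are the step sizes quoted in the text; centers and widths are
# HYPOTHETICAL placeholders chosen only to make the example run.
PEAKS = [
    (10.0, 105.0, 30.0),
    (6.0, 135.0, 25.0),
    (2.0, 260.0, 20.0),
    (31.0, 340.0, 35.0),
    (40.0, 362.0, 35.0),
]

def dtg_model(t, peaks=PEAKS, eta=0.0):
    """Model of -d(mass %)/dT as a sum of (possibly overlapping) peaks."""
    return sum(pseudo_voigt(t, a, c, w, eta) for a, c, w in peaks)

def integrate(func, start, stop, step=0.1):
    """Trapezoidal integration of func between start and stop."""
    n = int(round((stop - start) / step))
    total = 0.5 * (func(start) + func(stop))
    for i in range(1, n):
        total += func(start + i * step)
    return total * step

# Integrating the model over the scan range recovers the summed step areas,
# even where individual peaks overlap.
total_loss = integrate(dtg_model, 25.0, 800.0)
first_two = integrate(lambda t: dtg_model(t, PEAKS[:2]), 25.0, 800.0)
print(f"total mass loss ~ {total_loss:.1f} %, first two steps ~ {first_two:.1f} %")
```

In an actual fit the areas, centers, widths and mixing factors would be free parameters adjusted against the measured derivative curve; here they are fixed so that the relation between peak area and mass-loss step is easy to see.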
On the basis of the hypothesis of a cyclization mechanism of the l-PheAla molecule in the condensed phase 1-4, we propose the following interpretation of the TG-DTA results. The mass loss of ≈ (6±2)% observed at 120 °C may be consistent with the water-to-l-PheAla mass ratio of 7.6% expected for a 100% efficient 'intramolecular' water-emission process. The cyclization hypothesis may also explain the two steps of mass loss above 300 °C. Assuming the formation of the cyclo species (c-PheAla, 218 amu), one may suppose that the most probable dissociation mechanism of this species involves the loss of the phenyl ring, with the formation of two main fragments with m = 91 (benzyl radical) and 127 amu. The mass ratio of 0.42 for m = 91 amu is in good agreement with the 0.44 ratio of the first weight loss (31%) to the total mass loss of 71% in the range 300-400 °C. For the last step (40%) the ratio is 0.56, which is consistent with the 0.58 mass ratio of m = 127 amu with respect to the c-PheAla molecule (218 amu). The last endothermic peak, related to this step, could be due to another structural rearrangement during the "two-step sublimation" of the residual sample.
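The arithmetic behind this interpretation can be checked with a few lines of Python. The sketch below uses nominal masses; the c-PheAla and fragment masses and the mass-loss percentages are those quoted in the text, while the l-PheAla mass is taken as c-PheAla plus one water molecule (218 + 18 amu), which is an assumption of this example.

```python
# Nominal masses in amu.
M_WATER = 18.0
M_C_PHEALA = 218.0                     # quoted in the text
M_L_PHEALA = M_C_PHEALA + M_WATER      # assumption: l-PheAla = c-PheAla + H2O
M_BENZYL = 91.0                        # benzyl fragment, quoted in the text
M_REMAINDER = 127.0                    # second fragment, quoted in the text

# Mass-loss steps (in %) attributed to the two-step process above 300 deg C.
LOSS_STEP_4 = 31.0
LOSS_STEP_5 = 40.0
TOTAL_HIGH_T_LOSS = LOSS_STEP_4 + LOSS_STEP_5

# Expected water loss for a 100% efficient cyclization.
water_fraction = M_WATER / M_L_PHEALA

# Fragment mass ratios relative to c-PheAla.
benzyl_ratio = M_BENZYL / M_C_PHEALA
remainder_ratio = M_REMAINDER / M_C_PHEALA

# Measured step ratios relative to the total high-temperature loss.
step4_ratio = LOSS_STEP_4 / TOTAL_HIGH_T_LOSS
step5_ratio = LOSS_STEP_5 / TOTAL_HIGH_T_LOSS

print(f"expected water loss: {100 * water_fraction:.1f} %")
print(f"m = 91 : fragment ratio {benzyl_ratio:.2f} vs measured {step4_ratio:.2f}")
print(f"m = 127: fragment ratio {remainder_ratio:.2f} vs measured {step5_ratio:.2f}")
```

The two fragment masses add up to the c-PheAla mass, so the two fragment ratios sum to one, as do the two measured step ratios.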

IR and Raman results
The comprehensive theoretical assignments of the IR and Raman spectra calculated on the optimized structures of the linear neutral (l-neu), cyclo neutral (c-neu) and linear zwitterion (l-zwi) PheAla molecule are reported in Tables S1, S2 and S3, respectively. The region between 1250 and 1650 cm⁻¹ shows very significant differences between the measurements collected at room temperature on the pristine sample (l-PheAla) and those collected on the residual powder obtained after heating at the working temperature of the mass spectrometry measurements (r-PheAla). These differences are very well reproduced by the theoretical calculations for the l-zwi and c-neu structures, respectively, as shown in Figure S2.

Figure S2. Comparison between the experimental IR spectra of l-PheAla (black curve) and r-PheAla (red curve) and between the simulated IR spectra of the l-zwi (black) and c-neu (red) structures, on the left and right side respectively, in the range 1650-1250 cm⁻¹. The main contribution of the COO and NH₃ vibrations is indicated.
Before heating, several contributions are grouped in broad spectral features. The simulations of the l-zwi structure (black curve) suggest that the high-energy band, roughly between 1450 and 1650 cm⁻¹, receives contributions from the asymmetric stretching of the terminal COO group (1614 cm⁻¹) and from three groups of bands involving NH₃ bending (1610, 1490 and 1486 cm⁻¹), NH bending (1538 cm⁻¹) and phenyl CH bending (1490 and 1486 cm⁻¹), as also detailed in Table S3. A weaker, lower-energy group of bands between 1300 and 1450 cm⁻¹ contains as main contributions the symmetric stretching of the COO group (1407 cm⁻¹), accompanied by CH bending modes of the methyl (1376 cm⁻¹) and methine (1355 cm⁻¹) groups. These contributions are drastically reduced after cyclization, owing to the very different dynamical behavior of the six-membered ring, where the COO group is no longer present, with respect to the linear structure. In the c-neu simulation (red curve), the first of the two surviving bands can be assigned to the peculiar stretching mode of the CC and CN bonds in the diketopiperazine ring (1452 cm⁻¹), accompanied by the CH bending of the methylene (1476 cm⁻¹) and methyl (1478 cm⁻¹) groups; the latter mode is reinforced by the interaction with the -CH₂-phenyl group, which in the case of c-neu tends to fold toward CH₃. The second band is also dominated by CC and CN stretching modes (1347 cm⁻¹) involving both the diketopiperazine and phenyl rings.
The experimental IR measurements collected at room temperature on the c-GlyPhe and c-AlaGly samples are reported and compared with the results obtained on the r-PheAla sample. In the calculations, the cyclo neutral (c-neu) structures of the isolated PheAla, GlyPhe and AlaGly molecules have been considered. The comparison between the simulated and experimental IR spectra, shown in Figure S3, indicates that these cyclo dipeptides share the same fingerprints as the simulated spectrum of c-neu(PheAla) and the measured spectrum of r-PheAla. The analysis of the IR vibrational bands is consistent with the predictions for the c-neu structures of these molecules: both the measured and the simulated IR spectra of c-AlaGly and c-GlyPhe show striking similarities with those of r-PheAla. In the following, the characteristic IR vibrational modes compatible with a cyclo-dipeptide structure are reported and compared for the three samples.
The characteristic lines of the phenyl-CH₂- group in the PheAla Raman spectra of the l-neu, l-zwi and c-neu structures (see Figure S4) have been assigned by a close comparison between the spectra of c-GlyPhe (which contains the phenyl-CH₂- group) and c-AlaGly (which does not). The comparison between the experimental Raman spectra of the r-PheAla, c-AlaGly and c-GlyPhe samples and the simulated spectra of c-neu(PheAla), c-neu(AlaGly) and c-neu(GlyPhe) is shown in Figure S5.
Figure S6 summarizes the results of a preliminary investigation of the reaction barriers for the cyclization of a linear, neutral GlyGly molecule, already investigated in a previous study 15. We note that the stable structures and reaction paths have been obtained independently of those reported by Li et al., which are also shown in Figure S6, printed in red. Despite the independent approach and the slightly different theoretical framework used in the two studies, the results are qualitatively very similar, and the quantitative agreement is close enough to provide solid ground for applying the same method to the cyclization of PheAla, which has not been investigated in previous studies.

A preliminary analysis of the molecular aggregation of l-PheAla in the solid state, and of its effect on dipeptide cyclization, has been performed and is briefly discussed here. We focus on a system containing four l-PheAla molecules and only one water molecule (see Figure S7), thus investigating only the effects due to inter-peptide interactions and limiting the fluctuations due to an enhanced catalytic effect of water. A first assessment of the system has been obtained using the xTB-GFN2 Hamiltonian and the CREST sorting tool, as discussed in the main text. However, a DFT investigation of this system at the M062X@def2-TZVPP level of theory is quite expensive in terms of computational resources. For this reason, we fell back on the robust r²SCAN-3c functional with its tailored mTZVPP basis set 16, whose application to the one-PheAla+H₂O mechanism (see Figure S7, upper part) provides results close to those obtained using M062X and reported in Figure 7 of the main text. In the four-PheAla+H₂O mechanism (Figure S7, lower part), the water catalyst finds a third anchoring point in a strong H bond with a neighboring -COOH group, leading to a significant lowering of the barrier to cyclization, accompanied by the formation of the -C(OH)₂ intermediate. No particular effect is found for the reaction barrier leading to the water-catalyzed elimination of a second water molecule from the -C(OH)₂ intermediate. This is likely because this second reaction occurs on the outskirts of the simple four-molecule system employed, where the catalyst is no longer coordinated to a neighboring dipeptide molecule, leading to results in line with previous findings.
These results also suggest that a more complex and denser network of interactions, whose theoretical representation is clearly beyond the scope of this preliminary study, is required to unravel theoretically the massive rearrangement process that takes place when the real sample is heated at 85 °C.
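To give a feeling for what a lowering of the cyclization barrier means at the heating temperature of 85 °C, the following sketch applies the standard Eyring expression of transition state theory, k = (k_B T / h) exp(-ΔG‡ / RT). It is only an illustration: a transmission coefficient of one is assumed, and the barrier values passed to the functions are hypothetical placeholders, not the values computed in this work (which are reported in Figure S7 and in the main text).

```python
import math

K_B = 1.380649e-23       # Boltzmann constant, J/K
H = 6.62607015e-34       # Planck constant, J s
R_KCAL = 1.987204e-3     # gas constant, kcal/(mol K)

T_HEATING = 85.0 + 273.15  # heating temperature quoted in the text, in K

def eyring_rate(barrier_kcal_mol, temperature_k):
    """Eyring rate constant (1/s) for a free-energy barrier in kcal/mol.

    Assumes a transmission coefficient of 1 (an assumption of this sketch).
    """
    prefactor = K_B * temperature_k / H
    return prefactor * math.exp(-barrier_kcal_mol / (R_KCAL * temperature_k))

def rate_enhancement(barrier_lowering_kcal_mol, temperature_k):
    """Factor by which the rate grows when the barrier drops by the given amount."""
    return math.exp(barrier_lowering_kcal_mol / (R_KCAL * temperature_k))

# HYPOTHETICAL barriers, for illustration only.
barrier_isolated = 30.0    # kcal/mol, placeholder
barrier_aggregate = 25.0   # kcal/mol, placeholder

k_isolated = eyring_rate(barrier_isolated, T_HEATING)
k_aggregate = eyring_rate(barrier_aggregate, T_HEATING)
factor = rate_enhancement(barrier_isolated - barrier_aggregate, T_HEATING)
print(f"k(isolated)  = {k_isolated:.3e} 1/s")
print(f"k(aggregate) = {k_aggregate:.3e} 1/s")
print(f"enhancement  = {factor:.3e}")
```

Because the rate depends exponentially on the barrier, even a lowering of a few kcal/mol changes the rate by orders of magnitude at this temperature, which is why the additional anchoring of the water catalyst matters for the first reaction step.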